The widespread presence of offensive content online, such as hate speech and cyber-bullying, is a global phenomenon. This has sparked interest in the artificial intelligence (AI) and natural language processing (NLP) communities, motivating the development of various systems trained to detect potentially harmful content automatically. These systems require annotated datasets to train the machine learning (ML) models. However, with a few notable exceptions, most datasets on this topic have dealt with English and a few other high-resource languages. As a result, research in offensive language identification has been limited to these languages. This paper addresses this gap by tackling offensive language identification in Sinhala, a low-resource Indo-Aryan language spoken by over 17 million people in Sri Lanka. We introduce the Sinhala Offensive Language Dataset (SOLD) and present multiple experiments on this dataset. SOLD is a manually annotated dataset containing 10,000 posts from Twitter annotated as offensive or not offensive at both the sentence level and the token level, improving the explainability of the ML models. SOLD is the first large publicly available offensive language dataset compiled for Sinhala. We also introduce SemiSOLD, a larger dataset containing more than 145,000 Sinhala tweets, annotated following a semi-supervised approach.
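Scoring functions (SFs) measure the plausibility of triplets in knowledge graphs. Different scoring functions can lead to large differences in link prediction performance on different knowledge graphs. In this report, we describe a weird scoring function found by random search on the Open Graph Benchmark (OGB). This scoring function, called AutoWeird, uses only the tail entity and the relation in a triplet to compute its plausibility score. Experimental results show that AutoWeird achieves top-1 performance on the ogbl-wikikg2 dataset, but performs much worse than other methods on the ogbl-biokg dataset. By analyzing the tail entity distribution and the evaluation protocol of these two datasets, we attribute the unexpected success of AutoWeird on ogbl-wikikg2 to inappropriate evaluation and a concentrated tail entity distribution. Such results may motivate further research on how to accurately evaluate the performance of different link prediction methods for knowledge graphs.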
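The head-independent behavior described in this abstract can be illustrated with a minimal sketch. The dot-product form, the embedding sizes, and the names `tail_only_score` and `rank_tails` are assumptions made for illustration only; the abstract does not state AutoWeird's actual functional form.

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_ENTITIES, NUM_RELATIONS, DIM = 5, 2, 4

# Hypothetical embedding tables; a real model would learn these.
entity_emb = rng.normal(size=(NUM_ENTITIES, DIM))
relation_emb = rng.normal(size=(NUM_RELATIONS, DIM))


def tail_only_score(head: int, relation: int, tail: int) -> float:
    """Score a (head, relation, tail) triplet while ignoring the head.

    The dot product is an illustrative assumption, not AutoWeird's form.
    """
    return float(relation_emb[relation] @ entity_emb[tail])


def rank_tails(head: int, relation: int) -> list[int]:
    """Return candidate tail ids ordered from most to least plausible."""
    scores = [tail_only_score(head, relation, t) for t in range(NUM_ENTITIES)]
    return [int(i) for i in np.argsort(scores)[::-1]]
```

Because the head never enters the score, every query that shares a relation receives the same tail ranking. Such a function can therefore only do well when the correct tails for a relation are concentrated on a few entities, which is consistent with the explanation the abstract gives for the ogbl-wikikg2 result.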
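Computational optical imaging (COI) systems use optical coding elements (CEs) in their setups to encode a high-dimensional scene in a single snapshot or multiple snapshots, and decode it using computational algorithms. The performance of a COI system depends heavily on the design of its main components: the CE pattern and the computational method used to perform a given task. Conventional approaches rely on random patterns or analytical designs to set the distribution of the CE. However, the available data and the algorithmic capabilities of deep neural networks (DNNs) have opened a new horizon in data-driven CE design, which jointly considers the optical encoder and the computational decoder. Specifically, by modeling the COI measurements through a fully differentiable image formation model that accounts for the physics-based propagation of light and its interaction with the CEs, the parameters that define the CE and the computational decoder can be optimized in an end-to-end (E2E) manner. Moreover, by optimizing only the CEs within the same framework, inference tasks can be performed purely in optics. This work surveys recent advances in data-driven CE design and provides guidelines on how to parameterize different optical elements so that they can be included in the E2E framework. Since the E2E framework can handle different inference applications by changing the loss function and the DNN, we present low-level tasks such as spectral imaging reconstruction and high-level tasks such as privacy-preserving pose estimation, enhanced by optimal task-based optical architectures. Finally, we illustrate classification and 3D object recognition applications performed at the speed of light using all-optical DNNs.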
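Modeling human mobility helps to understand how people access resources and come into physical contact with each other in cities, and thus contributes to various applications such as urban planning, epidemic control, and location-based advertising. Next location prediction is a decisive task in individual human mobility modeling; it is usually viewed as sequence modeling and solved with Markov or RNN-based methods. However, existing models pay little attention to the logic of individual travel decisions and the reproducibility of the collective behavior of the population. To this end, we propose a Causal and Spatial-constrained Long and Short-term Learner (CSLSL) for next location prediction. CSLSL uses a causal structure based on multi-task learning to explicitly model the "when $\rightarrow$ what $\rightarrow$ where", a.k.a. "time $\rightarrow$ activity $\rightarrow$ location", decision logic. We next propose a spatial-constrained loss function as an auxiliary task to ensure consistency between the predicted and actual spatial distributions of travelers' destinations. Moreover, CSLSL adopts modules named Long and Short-term Capturer (LSC) to learn transition regularities across different time spans. Extensive experiments on three real-world datasets show that CSLSL improves on the baselines and confirm the effectiveness of introducing the causality and consistency constraints. The implementation is available at https://github.com/urbanmobility/cslsl.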
Ever since the first microscope by Zacharias Janssen in the late 16th century, scientists have been inventing new types of microscopes for various tasks. Inventing a novel architecture demands years, if not decades, worth of scientific experience and creativity. In this work, we introduce Differentiable Microscopy ($\partial\mu$), a deep learning-based design paradigm, to help scientists design new interpretable microscope architectures. Differentiable microscopy first models a common physics-based optical system, but with trainable optical elements at key locations on the optical path. Using pre-acquired data, we then train the model end-to-end for a task of interest. The learnt design proposal can then be simplified by interpreting the learnt optical elements. As a first demonstration, based on the optical 4-$f$ system, we present an all-optical quantitative phase microscope (QPM) design that requires no computational post-reconstruction. A follow-up literature survey suggested that the learnt architecture is similar to the generalized phase contrast method developed two decades ago. Our extensive experiments on multiple datasets that include biological samples show that our learnt all-optical QPM designs consistently outperform existing methods. We experimentally verify the functionality of the QPM design based on the optical 4-$f$ system using a spatial light modulator. Furthermore, we demonstrate that similar results can be achieved by an uninterpretable learning-based method, namely diffractive deep neural networks (D2NN). The proposed differentiable microscopy framework supplements the creative process of designing new optical systems and may lead to unconventional but better optical designs.
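The air that surrounds us is the primary source of respiration for all life forms. It is therefore vital to emphasize that balanced air quality is of the utmost importance to the respiratory health of all living beings, to environmental homeostasis, and even to economic equilibrium. Nevertheless, air quality has gradually deteriorated over the past few decades because of the continuous increase in polluting emissions from automobiles and industry into the atmosphere. Although many people have scarcely acknowledged the depth of the problem, the persistent efforts of determined parties, including the World Health Organization, have consistently pushed the boundaries toward a qualitatively better global air homeostasis by facilitating technology-driven initiatives to detect and predict air quality in a timely manner at regional and global scales. However, existing air quality monitoring frameworks lack real-time responsiveness and flexible semantic distribution. In this paper, we propose a novel Internet of Things framework that is easy to implement, semantically distributive, and empowered by a machine learning model. The proposed system is equipped with a Node-RED dashboard that processes, visualizes, and stores the primary sensor data acquired through a public air quality sensor network; the dashboard is further integrated with a machine learning model to obtain temporal and geospatial air quality predictions. An ESP8266 NodeMCU is incorporated as a subscriber to the Node-RED dashboard via a Message Queuing Telemetry Transport (MQTT) broker, in order to communicate quantitative air quality data or alert emails to end users through the developed web and mobile applications. The proposed system could therefore help empower public engagement in air quality through an unobtrusive, data-driven, and semantic framework.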